Selected Prior Research
Abstract
• 1996: Scaled tree-based classifiers to very large data sets. A fundamental challenge in data mining is mining data sets that are too large to fit into a computer’s memory. This matters for a wide variety of applications, ranging from homeland defense to identifying fraudulent credit card transactions. Tree-based classifiers and predictors are among the most accurate techniques in data mining. Our 1996 paper [16] described a method for computing tree-based classifiers on data sets that are too large to fit into memory. The first idea is to partition the data, build an individual tree on each partition, and then combine the trees into an ensemble, or collection, of classifiers. The second idea is to use stratified sampling to oversample rare events and distribute them over the partitions; this is essentially a variant of the sampling technique called bootstrapping. The method was implemented in Magnify’s 1996 version of the PATTERN data mining system under the name Averaged Classification Trees/Averaged Regression Trees (ACT/ART). PATTERN was the first data mining system to build very accurate classifiers on data sets that could not fit into a computer’s memory, which allowed classifiers to be built in 1996 on terabyte-size data sets at a time when memory was measured in megabytes and disks in gigabytes. The 1996 paper by Breiman [1] presented a complementary idea called bagging, in which ensembles of trees are built over small data sets by repeated sampling with replacement (another variant of bootstrapping). Building ensembles of trees via partitioning and appropriate bootstrapping is still considered by many to be the most effective algorithm for detecting rare events in large data sets.
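The two ideas described above (partition the data and average one tree per partition; oversample rare events into every partition) can be illustrated with a minimal sketch. It is not the ACT/ART implementation from PATTERN, whose details are not given here. It rests on several assumptions: labels are 0 (common) and 1 (rare); rare rows are drawn with replacement into each partition; a one-split "stump" stands in for a full classification tree; everything is held in memory, whereas in the large-data setting each partition would be read and fitted separately. All function names and parameters are hypothetical.

```python
import random


def make_partitions(rows, labels, n_parts, rare_per_part, seed=0):
    """Split common rows across partitions and oversample rare rows into each one.

    Assumption: label 1 marks the rare event and at least one rare row exists.
    """
    rng = random.Random(seed)
    common = [i for i, y in enumerate(labels) if y == 0]
    rare = [i for i, y in enumerate(labels) if y == 1]
    rng.shuffle(common)
    parts = []
    for k in range(n_parts):
        # Each common row lands in exactly one partition; rare rows are
        # sampled with replacement so every partition sees rare events.
        idx = common[k::n_parts] + rng.choices(rare, k=rare_per_part)
        parts.append(([rows[i] for i in idx], [labels[i] for i in idx]))
    return parts


def fit_stump(X, y):
    """Fit a one-split tree; returns (feature, threshold, p_left, p_right).

    A stump is a stand-in for a full tree to keep the sketch short.
    """
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            left = [yi for row, yi in zip(X, y) if row[f] <= t]
            right = [yi for row, yi in zip(X, y) if row[f] > t]
            if not left or not right:
                continue
            p_l, p_r = sum(left) / len(left), sum(right) / len(right)
            # Squared error when each leaf predicts its fraction of rare rows.
            err = sum((yi - p_l) ** 2 for yi in left)
            err += sum((yi - p_r) ** 2 for yi in right)
            if best is None or err < best[0]:
                best = (err, f, t, p_l, p_r)
    if best is None:  # no valid split: fall back to a single leaf
        p = sum(y) / len(y)
        return (0, float("inf"), p, p)
    return best[1:]


def predict_stump(stump, row):
    f, t, p_left, p_right = stump
    return p_left if row[f] <= t else p_right


def fit_ensemble(rows, labels, n_parts, rare_per_part, seed=0):
    """Fit one stump per partition."""
    parts = make_partitions(rows, labels, n_parts, rare_per_part, seed)
    return [fit_stump(X, y) for X, y in parts]


def predict_ensemble(stumps, row):
    """Average the per-partition scores, in the spirit of averaged trees."""
    return sum(predict_stump(s, row) for s in stumps) / len(stumps)


# Toy data: 200 rows with one feature; the 10 largest values are rare events.
rows = [[i / 100] for i in range(200)]
labels = [1 if i >= 190 else 0 for i in range(200)]
model = fit_ensemble(rows, labels, n_parts=4, rare_per_part=10)
print(predict_ensemble(model, [1.95]), predict_ensemble(model, [0.1]))
```

Because the rare rows are oversampled, the averaged scores are relative rankings rather than calibrated probabilities; a real system would presumably correct for the sampling rate, which this sketch does not attempt.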
Similar resources
An Educational model of Creativity Enhancement in Design Studios Using Prior Researches
Despite a large body of research on creativity in architecture, the concept of creativity as a multi-faceted phenomenon in design studios is still challenging. The present study aims to analyze the related literature and systematically categorize it to provide a conceptual framework to enhance creativity in design studios. By using a qualitative research method along with Sandelowski and Ba...
HOW DO EMPLOYERS’ 401(k) MUTUAL FUND SELECTIONS AFFECT PERFORMANCE?
Defined contribution plans, predominantly 401(k)s, are the primary source of personal retirement savings for American workers, making the investment decisions within these accounts a salient policy concern. These decisions are a result of two separate actions: the mutual fund options selected by the employer’s plan administrator and the specific funds chosen by the participant. While considerab...
Changes in dynamic exercise performance following a sequence of preconditioning isometric muscle actions.
Complex training is the method of coupling heavy and light loads into an organized sequence with the aim of facilitating postactivation potentiation. Anecdotal evidence has supported the use of complex training sequences, but scientific studies investigating the effects of sequencing isometric loads with dynamic muscle actions have been limited. The purpose of this study was to examine the effe...
Toward a Deeper Understanding of the Technology Acceptance Model: An Integrative Analysis of TAM
Generally speaking, technology acceptance model (TAM, Davis, et al., 1989; Davis, 1989) is a successful model and prior studies have explored the TAM from different perspectives and many antecedents have been identified. However, the overview of TAM shows that the explanatory power of TAM is limited. Furthermore, more and more inconsistencies in results appeared along with the development of TA...
Novice and Expert Teachers’ Conceptions of Learners’ Prior Knowledge
This study presents comparative case studies of preservice and first-year teachers’ and expert teachers’ conceptions of the concept of prior knowledge. Kelly’s (The Psychology of Personal Construct, New York: W.W. Norton, 1955) theory of personal constructs as discussed by Akerson, Flick, and Lederman (Journal of Research in Science Teaching, 2000, 37, 363–385) in relationship to prior knowledg...
Overcoming Methodological Concerns in the Investigation of Online Sexual Activities
Online Sexual Activity (OSA) is an important and growing phenomenon. Prior research in this area has been criticized on methodological grounds. This study examines the reliability of Internet research regarding online sexual activities by comparing a selected random sample to a convenience sample. Participation in the selected random sample was limited to every 1,000th visitor to the MSNBC webs...
Publication date: 2006